Abstract

This paper describes preliminary results and outlines the plan of
attack for an ongoing project to develop a practical system for
verifying outsourced computations. We want a client to be able to
describe a computation to a server, get back a purported output and
some auxiliary information, and use that auxiliary information to
verify that the output is correct. For outsourcing to be worthwhile,
the verification process should be substantially more efficient than
simply executing the computation.

Our approach to this problem is to exploit the theory of
probabilistically checkable proofs (PCPs). Specifically, our project
seeks to build a bridge between the theory and an implementable
system. We describe a protocol for outsourced computation that
includes algorithmic refinements of the PCP protocol and end-to-end
instantiation of the necessary steps (e.g., compilation to a form
suitable for application of the PCP theorem). Although we are in the
process of implementing the protocol and do not have experimental
results, we present a detailed analysis that provides cause for
optimism: our cycle and memory usage costs strongly suggest that our
system will be useful. We focus on the example of matrix
multiplication, where we show that for large matrices, our method
achieves enormous savings for the client and requires feasible
amounts of bandwidth.

1  Introduction  and  motivation 

This paper describes preliminary results and outlines the plan of
attack for an ongoing project to develop a practical system for
verifying outsourced computations. Broadly speaking, we are
interested in computations that are too expensive for the client to
perform locally, and do not admit obvious procedures for verifying
the correctness of a purported solution. Such computations range from
complicated numerical algorithms operating on large matrices (which
are polynomial but expensive) to NP-hard search problems where even
the polynomial time check of the solution is too costly. Furthermore,
we want to make only weak assumptions about the possible misbehavior
of servers: we do not want to rely on replication methods but instead
desire efficiently verifiable proofs of correctness.

Specifically, our goal is to realize the following high-level scheme,
depicted in Fig. 1: a client sends a description of a computation, P,
to a server (for example, in the form of a C program); the server
executes P and returns the claimed output and some auxiliary
information; the client uses the auxiliary information to verify that
the output is correct; and the verification is substantially more
efficient than simply executing the computation.

As a motivating scenario, consider a computationally limited device
that wants to offload processing to the cloud [14]. For example, a
smartphone might wish to outsource an expensive photographic
manipulation for want of computational cycles [17]. However, the
device owner may not be willing to assume the correctness of the
cloud. As another scenario, some 30 projects (Seti@home,
Folding@home, the Mersenne prime search, etc.) use the BOINC software
platform [2, 3] to leverage the spare cycles of volunteers' computers
to perform massive computations that would otherwise be infeasible.
Unfortunately, some "volunteers" run modified software that does not
compute the answers correctly [4]. Today, these projects check
volunteers' work by outsourcing the same computation to multiple
hosts, but this approach does not protect against clients that are
colluding or simply buggy. It would be far preferable if the central
project computers could verify the correctness of a volunteer's
purported answer.

Our approach to this problem is to exploit the remarkable work on the
theory of probabilistically checkable proofs (PCPs) [7, 16, 20, 32].
The central theorem in the subject is that any language in NP admits
proofs of membership that can be verified by checking a very small
number of bits in the proof. Naturally, the fact that PCPs might
provide a solution to the precise problem of verifiable outsourced
computing has not been lost on the theorists working in the area.

However, as they appear in the theoretical literature, PCPs are not
suitable for use in a real system for outsourcing: as we explain
below in Section 3, the constants and overhead costs of certain
aspects of the protocol make naive deployment infeasible. Broadly
speaking, the central problem is that the proofs are far too large to
practically transmit or even for the verifier to instantiate. Indeed,
the folklore is that PCPs are not really suitable for practical
systems right now. As a consequence, there has been a flurry of work
that addresses limited and specific classes of problems [5, 6, 21,
38-40] or makes strong trusted hardware assumptions [13, 22, 31, 35,
36].

Figure 1: High-level depiction of verified outsourced computation.
P is the computation, x is the input, y is the purported output, and
p is auxiliary information.

In contrast, our point of departure is to carefully reexamine the
grounds for this skepticism: we want to assess the feasibility of
building a bridge between the theory and an implementable system for
general purpose verifiable outsourced computation. We are not the
first to revisit this issue: recently, there have been attempts to
alleviate the costs of generic PCP [8, 20], with a view toward
verifiable outsourcing. However, to date they have not been specified
enough to be implementable.

By focusing on practical implementation from end to end, we have been
forced to flesh out the details and grapple with the constraints
imposed by a real system. Specifically, our project makes the
following contributions:

1. Design and specification. In order to design a practical system,
we have developed a number of refinements to the PCP protocol
(discussed in more detail in Section 3). To be clear, we are not
claiming substantial theoretical contributions. Rather, our
innovation is in applying techniques (carefully constructed Merkle
trees, batching, etc.) to reduce proof size and to amortize the cost
of encoding problems in a format that is suitable for the theoretical
machinery to apply. Using these refinements, we specify an end-to-end
protocol for verifiable outsourcing.

2. Feasibility analysis. In Section 4, we perform a detailed analysis
of our protocol in the context of matrix multiplication over
arbitrary fields, a concrete problem that is a central primitive for
machine learning and data mining applications, and an example of a
numerical computation that BOINC distributes. Our results show that
for large matrices, our method achieves enormous savings for the
client and requires feasible amounts of bandwidth.

3. Implementation roadmap. As we are still implementing, we cannot
present experimental results. However, in Section 5 we present a path
toward fully implementing our approach, and extending it to a broader
range of problems. This long-term program consists both of clearly
achievable "get it working" problems, as well as research questions.

Thus, our results indicate that what seemed like a risky goal (at
least to us) appears to be plausible. If we're right, then this opens
the door to practical use of PCPs, which would be important for real
networked systems, including classic distributed systems and data
centers.


2  Related  work 

This section describes prior work on verifiable outsourced
computations. We begin with work that shares our top-level summary:
PCPs to verify computations.

PCPs to verify computations. Babai et al. give a protocol to verify
computations in polylogarithmic time using PCPs [8], inspired by
[11]. Goldwasser et al. [20] sketch a protocol based on interactive
proofs to delegate computations; their scheme is asymptotically
efficient for the prover and the verifier. Unfortunately, these
papers do not specify the schemes sufficiently for their practical
viability to be clear; notably, the constants that would be
associated with actual reification of the protocols are unknown.
Indeed, as mentioned in the introduction, one of our purposes here is
to make the case for practical viability by carefully tracing through
the required costs of a concrete and fully-specified protocol.

Secure multi-party computation. Another approach to verifiable
outsourcing comes from secure multi-party computations. In such
protocols, two (or more) mutually untrusting parties can compute an
agreed-upon function of private data in a way that reveals publicly
only the result, keeping the private data secret [10, 12, 19, 42]. By
themselves, these protocols do not provide verifiable outsourced
computations. However, Gennaro et al. [17] combine Yao's construction
with Gentry's breakthrough results on homomorphic encryption [18] to
provide verifiable non-interactive computing. Their construction is
asymptotically efficient (in that it runs in polynomial time), but it
builds on Gentry's scheme, which is not yet practical to implement at
scale on today's computers.

Even when considering only the secure multi-party protocols, the
costs of expressing general computations in terms of Yao's garbled
circuit constructions and the attendant oblivious transfer protocols
are prohibitive. Indeed, the Fairplay system [9, 27] combines Yao's
construction with innovative compilation techniques to transform
programs written in a subset of C into secure multi-party
computations. But Fairplay, although a technical tour de force, is
essentially unusable for even relatively small problem instances.
Although there have been subsequent refinements of Fairplay, the
situation has not qualitatively improved (see [34] for a concrete
discussion of the costs of Fairplay and its descendants in the
context of computing the intersection of two sets).

Special-purpose protocols. As mentioned above, a number of works
verify outsourced computations but are tailored to specific problem
domains and encompass a circumscribed range of functions, e.g.,
database queries [39, 40], benchmarks [5], and linear algebra
operations [6, 21]. In contrast, our ultimate goal is a protocol that
supports functions encoded as C programs.

Trusted computing. Some work assumes trusted components on the
prover: a trusted platform module (TPM), secure hardware token,
hypervisor, or runtime platform [13, 22, 26, 29, 31, 35-37]. These
assumptions do not hold in all environments; we want any computer to
be able to be a prover.

Auditing. In some works, the verifier reruns the outsourced
computation on a fraction of the input [31, 41]. This does not
protect against an adversary that misreports a small number of
strategic outputs and evades the audit. PCPs, in contrast, spread the
entire output over the proof, so that even small deviations between
the purported output and the correct answer break the proof. Another
auditing approach is for the verifier to outsource the same
computation to multiple machines [2]. However, this approach assumes
that a majority of the provers are both bug-free and honest. We would
prefer not to make such assumptions. Finally, Pioneer [38] audits an
untrusted worker machine based on the time taken to perform a
computation. This technique needs detailed knowledge of the worker's
hardware; as mentioned above, we would like to avoid assumptions
about the prover's platform.

3  Approach 

In this section, we give an overview of our approach, and then
discuss refinements. Our core PCP machinery follows a construction in
[7]; the details of this construction are in Appendix C. Below, we
use verifier to mean the entity that describes the computation and
verifies the proof, and prover to mean the entity that executes the
computation and issues the proof.

3.1  Overview 

We begin with a seemingly trivial observation: if the prover gave the
verifier a complete execution trace of the desired computation,
including both the input and the output, and if the output in the
trace were precisely the claimed output, then that execution trace
would constitute a proof to the verifier that the computation was
carried out correctly. Of course, verifying and receiving this
execution trace would require more work than simply executing the
computation. Thus, we need a way for the prover to issue a proof to
the verifier but with the verifier able to check the proof by
inspecting it in only a few places. This structure is precisely what
PCPs enable.

We now give brief background on PCPs. Many of the PCP constructions
in the literature (see [7, 32] and citations therein) concern 3-SAT:
they allow a prover to prove that a given Boolean formula, $\phi$, in
3-CNF has a satisfying assignment. (A 3-CNF Boolean formula contains
True-or-False Boolean literals $b_1, \ldots, b_n$, and is a
conjunction of clauses of the form, for example,
$b_i \lor b_j \lor b_k$. A satisfying assignment is a setting of each
of the $b_i$ to True or False such that every clause, and hence the
entire formula, evaluates to True.) Of course, the assignment itself,
$a$, constitutes an (obvious) proof that $\phi$ has a satisfying
assignment: the verifier could plug in $a$ in every clause in $\phi$.
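The "obvious proof" mentioned above can be sketched in a few lines. This is only an illustration of plugging an assignment into every clause; the representation of literals as `(index, polarity)` pairs and the names `satisfies` and `clause_satisfied` are assumptions of this sketch, not notation from the PCP literature.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: a literal is (variable index, polarity), where
# polarity False means the variable appears negated in the clause.
Literal = Tuple[int, bool]
Clause = Tuple[Literal, Literal, Literal]


def clause_satisfied(clause: Clause, assignment: Dict[int, bool]) -> bool:
    """A clause is true if at least one of its literals is true."""
    return any(assignment[var] == polarity for var, polarity in clause)


def satisfies(formula: List[Clause], assignment: Dict[int, bool]) -> bool:
    """The naive check: read the whole assignment and test every clause."""
    return all(clause_satisfied(c, assignment) for c in formula)


# (b1 or b2 or not b3) and (not b1 or b2 or b3)
phi = [
    ((1, True), (2, True), (3, False)),
    ((1, False), (2, True), (3, True)),
]
good = {1: False, 2: True, 3: False}
bad = {1: True, 2: False, 3: False}

print(satisfies(phi, good), satisfies(phi, bad))
```

Note that this check reads every literal and every clause, which is exactly the cost that a PCP is meant to avoid.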

The surprising content of the PCP theorem, however, is that in fact
we can construct a compact proof that can be verified with
arbitrarily high probability by inspecting a relatively small number
of (randomly chosen) bits. In the case of 3-SAT, the PCP is a highly
redundant encoding of $a$ that spreads the information content of $a$
over the entire proof. As a result, given $\phi$, the verifier can
check just a few bits in the PCP to be convinced (with a bounded
error probability) that $\phi$ is satisfiable.

We now connect PCPs to the intuition above about execution traces.
Consider the representation of P as a Boolean circuit, $C_P$. A
Boolean circuit is a set of interconnected gates, each of which has
input wires and an output wire. P's input, x, will appear as values
assigned to the input wires of some of the "early stage" gates of
$C_P$, and likewise P's output, y, will appear as the values of the
output wires from "late stage" gates of $C_P$. More generally, a
True-False assignment to all of the wires in $C_P$ is precisely an
execution trace of P. Also, $C_P$, a collection of Boolean literals,
is equivalent to some Boolean formula, $\phi_P$, in 3-CNF. Thus, a
satisfying assignment to $\phi_P$, $a_P$, is a valid execution trace
of P.¹ Moreover, $\phi_P$ includes literals that correspond to x and
y.
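To make the "assignment to all wires is an execution trace" idea concrete, here is a minimal sketch that evaluates a small circuit and records the value of every wire. The gate format `(output wire, operation, input wires)`, the wire names, and the XOR example are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical gate format: (output wire, operation, input wires).
Gate = Tuple[str, str, Tuple[str, ...]]


def evaluate(circuit: List[Gate], inputs: Dict[str, bool]) -> Dict[str, bool]:
    """Evaluate gates listed in topological order.

    The returned dictionary assigns a value to every wire, which is the
    execution trace of the computation on the given inputs.
    """
    wires = dict(inputs)
    for out, op, ins in circuit:
        vals = [wires[w] for w in ins]
        if op == "AND":
            wires[out] = vals[0] and vals[1]
        elif op == "OR":
            wires[out] = vals[0] or vals[1]
        elif op == "NOT":
            wires[out] = not vals[0]
        else:
            raise ValueError(f"unknown gate type: {op}")
    return wires


# XOR from AND/OR/NOT gates: y = (x1 OR x2) AND NOT (x1 AND x2)
xor_circuit = [
    ("t1", "OR", ("x1", "x2")),
    ("t2", "AND", ("x1", "x2")),
    ("t3", "NOT", ("t2",)),
    ("y", "AND", ("t1", "t3")),
]

trace = evaluate(xor_circuit, {"x1": True, "x2": False})
print(trace)
```

The trace contains the input wires (`x1`, `x2`), the intermediate wires, and the output wire (`y`), mirroring how x and y appear among the literals of the formula.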

At this point, we are ready to apply PCPs: given $\phi_P$, the prover
issues a PCP that $\phi_P$ is satisfiable. Then the verifier carries
out the following steps:

S1 The verifier inspects the PCP in a few places to check its
validity; by the equivalence above, this establishes that the prover
has a valid execution trace for P.

S2 The verifier must establish that the execution trace is based on
the supplied input, x. But an assignment to the "input literals" is
encoded in $\phi_P$, and the verifier can (using a small number of
random queries) check whether these literals match x.

S3 The verifier must now extract the output of the execution trace,
y; this proceeds in the same fashion as the previous step.

(Note  that  this  is  a  randomized  protocol,  and  there  is  a 
probability  of  error  in  this  process,  but  as  usual  it  can  be 
made  arbitrarily  small  by  repetition;  see  §4.) 
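The effect of repetition can be sketched numerically. The sketch assumes (this is not a figure from the analysis) that a single round accepts an incorrect proof with probability at most `per_round_error`, and that rounds use independent randomness, so that `k` rounds all accept with probability at most `per_round_error ** k`. The function name and the example values are hypothetical.

```python
import math


def rounds_needed(per_round_error: float, target_error: float) -> int:
    """Smallest k with per_round_error ** k <= target_error.

    Assumes independent repetitions, each with the same error bound.
    """
    if not 0.0 < per_round_error < 1.0:
        raise ValueError("per_round_error must lie strictly between 0 and 1")
    if not 0.0 < target_error < 1.0:
        raise ValueError("target_error must lie strictly between 0 and 1")
    k = math.ceil(math.log(target_error) / math.log(per_round_error))
    return max(1, k)


# Example with made-up numbers: halve the error each round, aim for 1e-4.
k = rounds_needed(0.5, 1e-4)
print(k, 0.5 ** k)
```

Because the number of rounds grows only logarithmically in the inverse of the target error, repetition multiplies the verifier's cost by a small factor.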

The  work  of  our  project  is  to  refine  the  approach  in 
several  ways,  as  described  in  the  rest  of  this  section: 

• How can we avoid the entire (huge) proof passing from prover to
verifier? (§3.2.)

• Given P, writing down $\phi_P$ is as expensive as simply executing
P, which seems to defeat the purpose of outsourcing. Thus, how can we
amortize and mitigate this cost, or at least avoid the verifier
incurring it? (§3.3.)

• How can the verifier efficiently extract only the output literals
from the proof? (§3.4.)

• How exactly can we go from a high-level description of a
computation, P, into a 3-CNF representation, $\phi_P$? Here, we do
not innovate but just report our proposed approach, and why it ought
to be practical. (§3.5.)

¹ The observation that a program's execution can be encoded as a
satisfiability instance is at the heart of Cook's proof that
satisfiability is NP-complete [15].

3.2  Reducing  network  costs 

As so far described, the protocol requires the verifier to receive a
proof whose size is polynomial in the number of literals of the
Boolean formula. To drastically reduce network costs, the prover can
commit to a digest of the proof. For each location in the proof that
the verifier wants to inspect, it interactively queries the prover.
The prover's responses must be consistent with the digest.

Although  this  high-level  idea  seems  straightforward, 
implementing  it  efficiently  in  our  context  requires  some 
care.  Below,  we  describe  four  successive  modifications, 
each  reducing  network  costs. 

First, as Kilian [24, 25] suggests, the prover encodes the PCP as a
Merkle tree [30]; the digest is the tree's root. Specifically, each
leaf in the tree is a collision-resistant hash of some portion of the
proof and the interior nodes are collision-resistant hashes of their
children. To inspect a location, l, in the PCP, the verifier submits
l to the prover. The prover not only responds with the contents of l
but also produces a path through the tree that proves that the
contents were an input to the original digest. While this core idea
is helpful, it was proposed in the context of generic PCP, rather
than our specific scenario of verifying outsourced computation. In
our scenario, steps S2 and S3 require the verifier to retrieve all of
the input and output literals, which, naively, would require an
interaction for each literal, which would be infeasible.
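The commit-and-open idea described above can be sketched as follows. This is a minimal illustration, assuming SHA-256 as the collision-resistant hash and a power-of-two number of proof blocks; the function names (`build_tree`, `open_location`, `verify`) are hypothetical and do not come from any cited construction.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """Collision-resistant hash (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def build_tree(blocks: List[bytes]) -> List[List[bytes]]:
    """Prover side: return all levels, from hashed leaves up to the root."""
    if not blocks or len(blocks) & (len(blocks) - 1):
        raise ValueError("number of blocks must be a power of two")
    levels = [[h(b) for b in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def open_location(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Prover side: sibling hashes on the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return path


def verify(root: bytes, index: int, block: bytes, path: List[bytes]) -> bool:
    """Verifier side: recompute the root from one block and its path."""
    node = h(block)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


proof_blocks = [b"p0", b"p1", b"p2", b"p3"]
levels = build_tree(proof_blocks)
digest = levels[-1][0]          # the only value the prover must commit to
path = open_location(levels, 2)
print(verify(digest, 2, b"p2", path), verify(digest, 2, b"forged", path))
```

Each opened location costs one block plus a number of hashes logarithmic in the number of leaves, which is why the verifier never needs to receive the full proof.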

Thus, our second modification is to rearrange the Merkle tree so that
the input literals are covered by a single leaf; we do likewise with
the output literals. Now, retrieving the prover's claimed values for
the input literals requires one invocation of the interactive
protocol, and likewise retrieving y. To further reduce the costs,
observe that the verifier begins with x, so step S2 consists only of
the verifier checking that the hash of x was an input to the digest;
x itself need no longer travel.

Third, observe that the verifier doesn't even have to have x in hand;
all that step S2 really requires is knowledge of the hash of x. This
observation offers significant network and memory gains to the
verifier: now computations can be outsourced even when the verifier
does not know the input, as might be the case for, say, a large data
mining application. Of course, applying this insight requires a way
for the verifier to receive the hash of the input from a source that
it trusts, out of band.

Fourth, sometimes in step S1, the verifier randomly chooses to
inspect the proof locations that contain the input literals. In those
cases, given our current Merkle tree structure, the prover must send
the full input to the verifier. This step could be costly. To make it
cheaper, our final modification is to change the leaf node that
covers the input literals from a flat hash of x to a Merkle tree
encoding of x (and the out of band hash follows suit).
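The rearranged tree can be sketched as below: leaf 0 is itself the root of a Merkle tree over chunks of x, leaf 1 covers y, and the remaining leaves cover the rest of the proof. The layout, the use of SHA-256, and all names here are assumptions of this sketch; the point is only that step S2 can be checked from a trusted digest of x without x itself.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """Collision-resistant hash (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Root over already-hashed leaves; leaf count must be a power of two."""
    level = list(leaves)
    if not level or len(level) & (len(level) - 1):
        raise ValueError("number of leaves must be a power of two")
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def input_digest(x_chunks: List[bytes]) -> bytes:
    """Merkle encoding of x; this is what a trusted source would publish."""
    return merkle_root([h(c) for c in x_chunks])


def proof_digest(x_chunks: List[bytes], y: bytes, pcp_blocks: List[bytes]) -> bytes:
    """Prover side: leaf 0 covers the input literals, leaf 1 the output."""
    leaves = [input_digest(x_chunks), h(y)] + [h(b) for b in pcp_blocks]
    return merkle_root(leaves)


def check_input_bound(root: bytes, trusted: bytes, siblings: List[bytes]) -> bool:
    """Step S2 using only the out-of-band digest of x.

    Leaf 0 is the leftmost leaf, so it is the left child at every level.
    """
    node = trusted
    for sibling in siblings:
        node = h(node + sibling)
    return node == root


x_chunks = [b"x0", b"x1", b"x2", b"x3"]
y = b"claimed output"
pcp_blocks = [b"q0", b"q1"]
root = proof_digest(x_chunks, y, pcp_blocks)
siblings = [h(y), h(h(b"q0") + h(b"q1"))]  # supplied by the prover
print(check_input_bound(root, input_digest(x_chunks), siblings))
```

Because the input leaf is a tree rather than a flat hash, a query that lands on an input literal could be answered by opening a single chunk of x against `input_digest`, instead of sending all of x.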

3.3  Amortizing  setup  costs 

For any computation P, writing down a Boolean circuit equivalent to
P, and hence writing down the 3-CNF Boolean formula $\phi_P$, takes
as much time as executing P and at least as much space (likely more).
Thus, for outsourcing to be worthwhile, the verifier must be able to
amortize these costs, and ideally avoid them altogether. Below we
describe three ways that the verifier can do so.

First, if the verifier will verify the same computation with
different inputs (as in the BOINC [2] examples given in the
introduction), it can "reuse" $\phi_P$, thereby amortizing the work
to realize $\phi_P$. This observation is not totally trivial: the
reuse works because the PCP construction that we use (see Appendix C
and [7]) allows the verifier to conduct verification based only on
the PCP, on x, and on values that can be quickly computed from
$\phi_P$. (Simplifying slightly, these values are derived by encoding
$\phi_P$ as a polynomial $g_{\phi_P}$, choosing some random values
$\{r\}$ and then returning $\{g_{\phi_P}(r)\}$.) Thus, with our
protocol, the verifier would incur a time cost equivalent to
executing P once but thereafter save work, since verification is far
cheaper than executing P.
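The amortization argument can be written as a short break-even calculation. The symbols below are assumptions introduced only for this sketch: $T$ is the cost of executing P once (and, following the text, also the one-time cost of realizing $\phi_P$), $V$ is the cost of one verification, and $k$ is the number of instances of P that are outsourced.

```latex
\begin{align*}
\text{cost with outsourcing} &= T + kV, \\
\text{cost of local execution} &= kT, \\
T + kV < kT
  &\iff k\,(T - V) > T
  \iff k > \frac{T}{T - V} \qquad (V < T).
\end{align*}
```

Under these assumptions, when $V \ll T$ the threshold $T/(T-V)$ is only slightly above 1, so outsourcing already pays off from the second instance onward.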

Second, we need to save the verifier space: $\phi_P$ is likely larger
than the scratch space that the verifier would have needed to execute
P. As an example, for m × m matrix multiplication, discussed in
detail in §4.1, the size of $\phi_P$ is proportional to $m^3$ while
the space to simply execute P is proportional to $m^2$.

To save the verifier space, we observe that the verifier doesn't need
to retain $\phi_P$, provided it has access to a small set
$\{g_{\phi_P}(r)\}$, for random values of r; we call this set a
fingerprint of $\phi_P$. Thus, consider this scenario: a verifier
(e.g., a BOINC project) pays the time cost once to realize $\phi_P$.
Then, the verifier pre-computes $g_{\phi_P}$ at every point in its
domain (which costs time that is low-degree polynomial in the size of
$\phi_P$). The verifier then constructs a Merkle tree of these
values, stores the value of the root node locally, and stores the
pre-computed $g_{\phi_P}$ values in a cloud storage service [1]. Now
the verifier can throw away $\phi_P$, and verification requires
little space: the verifier simply retrieves randomly selected
$g_{\phi_P}$ values as needed. The verifier is protected against the
storage service returning bogus values because the verifier knows the
digest of the $g_{\phi_P}$ values.

Our third approach is to remove even the time and space cost of
briefly materializing $\phi_P$. The simple observation is that the
verifier never needs to handle $\phi_P$, if it can get a valid
fingerprint from a source that it trusts (e.g., a registry mapping
known P to pre-computed g
3.4  Extracting  the  inputs  and  outputs 

A feature of the version of the PCP protocol that we use is that the
input and output literals can be easily extracted from a particular
location in the proof essentially in plaintext. In order to protect
against a malicious prover, of course we cannot exploit this fact and
must instead use a more involved protocol to coalesce the values of
the input and output literals from random queries of the proof.
However, for the purpose of the analysis in this paper, we assume a
threat model in which the prover is unreliable but not malicious.
That is, we assume that attacks in which the proof is generated
correctly and is accepted by the verifier but has corrupted "naive"
output literals do not occur. In work in progress, we are developing
efficient schemes to remove this restriction (§5).

3.5  Converting  programs  into  3-CNF 

So far, we have been assuming that some entity (the verifier or its
delegate) can, given a description of P in a high-level language,
realize $\phi_P$, a Boolean formula in 3-CNF. We now discuss the
methods by which this can be accomplished and the costs. We describe
the work as being done by the verifier, even though, as mentioned
above, the work might be done by a delegate.

At a high level, the main problem is to go from P to the circuit
$C_P$. (To go from $C_P$ to $\phi_P$ is the easy part: by following
established techniques [28], the verifier can transform $C_P$ into a
3-CNF formula that has no more than five times as many clauses and
five times as many literals as $C_P$ has gates.)
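One standard way to turn gates into clauses is a Tseitin-style encoding, sketched below. This is an assumption chosen for illustration and is not necessarily the technique of [28]. Wires are numbered from 1 and a negative number denotes a negated literal (a DIMACS-like convention); each gate contributes clauses that force its output wire to equal the gate function of its input wires. Clauses here have at most three literals; repeating a literal would pad them to exactly three.

```python
from typing import Dict, List, Tuple

Clause = Tuple[int, ...]  # signed wire numbers; negative means negated


def gate_clauses(op: str, out: int, a: int, b: int = 0) -> List[Clause]:
    """Clauses that hold exactly when wire `out` equals op(a, b)."""
    if op == "AND":
        return [(-out, a), (-out, b), (out, -a, -b)]
    if op == "OR":
        return [(out, -a), (out, -b), (-out, a, b)]
    if op == "NOT":
        return [(-out, -a), (out, a)]
    raise ValueError(f"unknown gate type: {op}")


def satisfied(clauses: List[Clause], assignment: Dict[int, bool]) -> bool:
    """True if every clause contains at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


# XOR circuit: wires 1, 2 are inputs; 3 = 1 OR 2; 4 = 1 AND 2;
# 5 = NOT 4; 6 = 3 AND 5 is the output.
cnf = (
    gate_clauses("OR", 3, 1, 2)
    + gate_clauses("AND", 4, 1, 2)
    + gate_clauses("NOT", 5, 4)
    + gate_clauses("AND", 6, 3, 5)
)

trace = {1: True, 2: False, 3: True, 4: False, 5: True, 6: True}
corrupted = dict(trace)
corrupted[6] = False  # claim a wrong output for the same inputs

print(len(cnf), satisfied(cnf, trace), satisfied(cnf, corrupted))
```

In this encoding each two-input gate yields three clauses and each NOT gate two, which stays within the "five times as many clauses as gates" bound quoted above, and a trace with a wrong wire value violates at least one clause.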

To generate $C_P$, one possibility is that the verifier can produce
the circuit by hand. This is not as ridiculous as it sounds,
especially for the types of linear algebra applications that are our
initial target applications; indeed, in Section 4, we do precisely
such a compilation by hand. It is tractable in part because the loop
structure of matrix multiplication is so straightforward that
unrolling it into a circuit is easy. More generally, there are many
computations that can be expressed as Boolean circuits.
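As a rough illustration of why unrolling matrix multiplication is straightforward, the sketch below walks the textbook triple loop and counts the arithmetic units a fully unrolled circuit would contain. It assumes the naive algorithm with one multiplier per scalar product and one adder per accumulation; the function name is hypothetical.

```python
from typing import Tuple


def unrolled_unit_counts(m: int) -> Tuple[int, int]:
    """Count multiplier and adder units in the unrolled naive m x m product."""
    multipliers = 0
    adders = 0
    for i in range(m):          # row of the left matrix
        for j in range(m):      # column of the right matrix
            for k in range(m):  # one product per term of the inner product
                multipliers += 1
                if k > 0:       # the first term needs no addition
                    adders += 1
    return multipliers, adders


for m in (2, 3, 10):
    print(m, unrolled_unit_counts(m))
```

The counts come out to $m^3$ multipliers and $m^2(m-1)$ adders, and the loop bounds do not depend on the data, which is what makes the unrolling mechanical.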

However, a major goal of our project is to outsource computations
expressed in a restricted subset of a general-purpose programming
language, like C. A starting point for our work (in progress) in this
area is the innovative compiler module in the Fairplay [27] project;
it uses SSA to produce efficient circuits from a subset of C. (Note
that the tremendous inefficiencies in Fairplay come from the
oblivious transfer protocol and consequent restrictions on
optimization of the circuits.) We hope to concurrently refine our
protocol and the compiler to extend our techniques to programs of
reasonable size.

Metric  Count

Mult  <  7((f  +  48)  log3  n  +  (f  +  20)  log n  +  §  +  16)

Add  <  7(36  log4  n  +  (f  +  36)  log2  n  +  §  +20)

Hash  <  7(8641og2»  +  (‘f>+240)log«  +  12io|Si|7i+(f  +8))
N/W  <  7((432+f +864.s)log2n+(^+240.v+120)logn+

(f+96)iogg7,+(f  +  192))  +  f  +  » _

Figure 2: Upper bounds on the verifier's cost, in terms of number of
multiplication, addition, and collision-resistant hash operations,
and bits transferred over the network between the verifier and the
prover, for a Boolean formula with n literals. $\delta$ is set to
$10^{-4}$ to get the error probability described in §4.1, s is the
size in bits of a collision-resistant hash, and i and o are the
respective sizes in bits of the input and output. The functions are
polylogarithmic in the size of the computation; the work done by the
verifier and the network resources consumed grow much slower than the
size of the computation.

4  Analysis  and  suitability 

This section gives a detailed analysis of the cost to the verifier of
executing our protocol, focusing on the running example of matrix
multiplication over a field. We answer two high level questions here:
(1) What are the verification costs (§4.1)? (2) What class of
computations are likely to result in a cheaper verification relative
to just executing the computation locally (§4.2)?

4.1  Analysis 

We first estimate the protocol's costs in terms of n, the number of
literals in $\phi_P$, and then apply these estimates to a specific
computation and accompanying Boolean formula: matrix multiplication.
We count the operations performed by the verifier and determine the
network costs. (We do not include the costs of compiling or storing
$\phi_P$, which we expect to be amortized.) Exact counts are in
Appendix D; simpler loose upper bounds are in Figure 2. All of these
costs are polylogarithmic in n, so grow much slower than the size of
the computation. The counts incorporate a term $\gamma$, which is
specified so that the error probability (the probability that a
correctly functioning verifier accepts an incorrect output) is at
most . The counts also include hash operations, which come from the
constructions described in §3.2.

We now estimate the cost of verifying the computation of "m × m
matrix multiplication with 32-bit entries". Call this computation M.
To apply the counts above, we must determine n for the 3-CNF Boolean
formula $\phi_M$. We assume, naively, $m^3$ 32-bit combinatorial
multiplier circuits and $(m-1)m^2$ 32-bit adder circuits. As derived
in Appendix A, a loose upper bound on n is
$34784 m^3 + 1120 m^2 (m-1)$. Taking this value of n in the exact
counts, we get the estimates in Figure 3, for different m. (We take
s = 160 and $\gamma = 1$; to drive the error probability to d-, we
must multiply the reported costs by $\gamma$.) The figure also
compares the costs to the baseline case of executing the computation
locally. The savings in computation are enormous. There is some
network cost relative to local computation, but, as the matrix size
grows, this network cost is dwarfed by the network resources required
to send the input and receive the output. Moreover, any outsourcing
scheme would have to pay this input/output network cost.

Figure 3: Relative costs of performing matrix multiplication on m × m
matrices. B is Baseline, O is our scheme, Mult is multiplication, Add
is addition, Hash is the number of hash operations in millions, and
I/O is input and output costs. Note that as the size of the input
grows, the size of the proof sent over the network becomes
insignificant relative to the input size.

4.2  Suitability 

In  our  scheme,  the  cost  of  verification  is  polylogarithmic 
in  the  size  of  the  computation  (Figure  2)  and  the  con¬ 
stants  are  small.  Therefore,  a  verifier  using  our  scheme 
will  save  work  by  outsourcing  the  computation  when  the 
local  costs  grow  faster  than  this  (e.g.,  the  local  costs  are 
polynomial).  We  should  note  that  in  the  simplified  pro¬ 
tocol  we  study  here,  we  also  depend  on  a  compact  rep¬ 
resentation  as  a  Boolean  circuit.  As  discussed  in  the  next 
section,  in  future  work  we  plan  to  relax  this  requirement. 

However,  there  are  many  computations  that  meet  these 
requirements  already:  various  linear  algebra  operations, 
string  pattern  matching  (as  in  a  virus  checker),  context- 
free  parsing,  etc. 

5  Research  agenda 

In  this  section,  we  outline  our  program  for  producing  a 
practical  system  for  verifiable  outsourced  computation. 

Circuit  generation.  One  of  our  goals  is  to  work  with 
arbitrary  computations  expressed  in  C,  which  requires 
making  it  feasible  to  compile  such  computations  into 
concise  formulas.  §3.5  described  our  plans  here. 

Efficiency  for  the  prover.  We  have  so  far  focused  on 
the  verifier’s  efficiency.  The  computational  burden  on  the 
prover  is  heavier.  We  are  investigating  protocol  refine¬ 
ments  to  improve  this. 


Batching. If a verifier outsources multiple computations
at once, then the verifier can save significant resources
by batching: the prover generates a single proof
for all the computations, instead of separate proofs for
each. We are investigating modifying the verifier’s protocol
accordingly. In particular, the verifier should be able
to use the formula or fingerprint of the single instance of
the computation to obtain the fingerprint of the batch.

Using more efficient PCP constructions. There are
constructions in the literature in which the verifier inspects
$O(1)$ bits instead of $O(\log^4 n)$ [33] (where n is the
number of literals in the Boolean formula).

A  more  expansive  threat  model.  We  are  developing 
efficient  ways  to  remove  the  assumptions  we  made  about 
extracting  the  output  literals  from  the  proof  (§3.4),  al¬ 
lowing  us  to  operate  in  a  much  broader  threat  regime. 

A  Estimates  on  the  number  of  literals 

Here we describe the steps we took to estimate the
number of literals in a 32-bit adder and a 32-bit multiplier
when they are represented as 3-CNF Boolean formulas,
as noted in §4.1. We examined an adder circuit from [23]
that adds two 1-bit binary numbers. We then calculated the
number of literals in that circuit after converting it to a
3-CNF Boolean formula using the procedure from [28]. We
found this count to be at most 35 literals. A naive way to
get a 32-bit adder is to chain 32 of these 1-bit adders as a
ripple-carry adder. Therefore, a 32-bit adder will have at
most 1120 (= 35 × 32) literals.

Next, we looked at a 4-bit multiplier circuit from [23]
and estimated an upper bound on the number of AND
gates and adders used by a 32-bit multiplier. A naive
32-bit multiplier will use at most $2\binom{32}{2}$ adders and
2 × 32 AND gates. We then estimated the number of literals that
would be present in the 3-CNF Boolean formula representation
of a 32-bit multiplier. This count is at most
34784 (= $2 \times 32 + 2\binom{32}{2} \times 35$).
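The arithmetic behind these bounds can be replayed with a short Python sketch. It is an illustration only: the function names are hypothetical, and the adder count of $2\binom{32}{2} = 992$ is an assumption inferred from the stated total of 34784 rather than a figure read directly from the circuit in [23].

```python
from math import comb

LITERALS_PER_1BIT_ADDER = 35  # upper bound stated in this appendix
WORD_BITS = 32


def adder_literals(bits=WORD_BITS):
    """Upper bound on literals in a ripple-carry adder of the given width."""
    return LITERALS_PER_1BIT_ADDER * bits


def multiplier_literals(bits=WORD_BITS):
    """Upper bound on literals in a naive multiplier of the given width.

    Assumption: 2 * bits AND gates (one literal each in the count) and
    2 * C(bits, 2) one-bit adders, which reproduces the stated 34784.
    """
    and_gates = 2 * bits
    adders = 2 * comb(bits, 2)
    return and_gates + adders * LITERALS_PER_1BIT_ADDER


def matmul_literals(m, bits=WORD_BITS):
    """Loose upper bound on n for m x m matrix multiplication (see §4.1)."""
    multipliers = m ** 3          # one multiplier per scalar product
    adders = m ** 2 * (m - 1)     # m - 1 additions per output entry
    return multipliers * multiplier_literals(bits) + adders * adder_literals(bits)
```

For example, `matmul_literals(2)` evaluates 34784 · 8 + 1120 · 4, the bound on n for 2 × 2 matrices.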

B  Background 

Before presenting the core protocol, we describe some of
the tools used by the PCP construction. We adapt the notation,
terminology, and content here from [7]. Our protocol
uses the PCP construction from [7], i.e., the construction
of PCP($O(\log n)$, $O(\log^4 n)$). In this construction,
the verifier uses $O(\log n)$ bits of randomness
and examines $O(\log^4 n)$ bits of the proof, where n is the
number of Boolean literals in the 3-CNF formula whose
satisfiability is being proven. Our notation is summarized
in Figure 4.

B.1 Arithmetization

Each clause in a Boolean formula, $\phi$, in 3-CNF form,
can be represented by a polynomial of degree 3 over a
finite field. This can be done by replacing the Boolean
$\lor$ with the field multiplication. Also, a Boolean variable,
u, is replaced by $(1 - x_u)$ and a negated variable, $\bar{u}$, is
replaced by $x_u$, where $x_u$ is a variable that can take values
from the finite field.

Symbol      Meaning

n           Number of Boolean variables (> 3)
q           A prime number (> $100\lceil \log^4 n \rceil$)
F           Finite field $\{0, \dots, q-1\}$
H           Subset $\{0, \dots, |H|-1\}$ of F ($|H| = \lceil \log n \rceil$)
k           Number of variables of a polynomial (large enough that $|H^k| > n$)
$F_{d,k}$   Set of k-variate polynomials of degree d over the field F
$\mathbf{h}$  A vector; $h_i$ denotes the i-th component of $\mathbf{h}$

Figure 4: Notation used in the description of our scheme

Each clause of $\phi$ gets converted to one of the following
polynomials depending on the number of negated variables.
Let $x, y, z \in F$, a finite field. Now, define:

$$p_0(x,y,z) = (1-x)(1-y)(1-z)$$
$$p_1(x,y,z) = x(1-y)(1-z)$$
$$p_2(x,y,z) = xy(1-z)$$
$$p_3(x,y,z) = xyz$$

A clause is said to be of type j if the clause contains j
negated variables. Assume that the negated variables always
appear at the beginning of the clause, in increasing order
of their indices. The remaining variables appear after the
negated variables, also in increasing order of their indices.

Let $\chi_j$, for j = 0, 1, 2, 3, be the four clause-characteristic
Boolean functions such that $\chi_j(i_1, i_2, i_3) = 1$
if and only if $\phi$ contains a clause of type j with variables
$u_{i_1}$, $u_{i_2}$, and $u_{i_3}$.

Now, given a vector a, to check if a satisfies the Boolean
formula, one has to check if the following four functions
are identically zero for all $i_1, i_2, i_3 \in \{1, \dots, n\}$
and j = 0, 1, 2, 3:

$$\chi_j(i_1, i_2, i_3)\, p_j(a_{i_1}, a_{i_2}, a_{i_3}) = 0$$
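The satisfiability condition above can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions, not the implementation: the prime 101 merely stands in for q, the clause encoding `(j, (i1, i2, i3))` and the function names are hypothetical, and iterating over the clauses of the formula plays the role of the characteristic functions (which are zero wherever no clause is present).

```python
Q = 101  # small prime standing in for q (assumption for illustration)


def clause_poly(j, x, y, z, q=Q):
    """Evaluate p_j(x, y, z): the first j arguments are the negated variables."""
    product = 1
    for position, value in enumerate((x, y, z)):
        # A negated variable contributes its value; a plain one contributes 1 - value.
        factor = value if position < j else (1 - value)
        product = (product * factor) % q
    return product


def satisfies(clauses, a, q=Q):
    """Check that every clause polynomial vanishes on the assignment a.

    clauses: list of (j, (i1, i2, i3)) with the j negated variables listed first.
    a: list of 0/1 values indexed by variable.
    """
    return all(
        clause_poly(j, a[i1], a[i2], a[i3], q) == 0
        for j, (i1, i2, i3) in clauses
    )


# (u0 OR u1 OR u2) AND (NOT u0 OR u1 OR u2)
EXAMPLE = [(0, (0, 1, 2)), (1, (0, 1, 2))]
```

With `EXAMPLE`, the assignment `[0, 1, 0]` makes both clause polynomials vanish, while `[1, 0, 0]` leaves the second one equal to 1, which mirrors the truth of the two clauses.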

B.2  Zero-tester  polynomials 

The problem of verifying whether a polynomial is zero at
every point of $H^{3k}$ can be solved efficiently
by using zero-tester polynomials [7]. This is done by
checking whether the polynomial under consideration,
multiplied by a zero-tester polynomial (chosen uniformly at
random from a set of zero-testers), sums to zero over $H^{3k}$.
Here we describe an example of a family of zero-testers
that is used by our scheme.

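For any $\mathbf{b} \in F^{3k}$, define

$$R_{\mathbf{b}}(x_1, \dots, x_{3k}) = \prod_{i=1}^{3k} I_{b_i}(x_i),$$

where each univariate polynomial $I_{b_i}$ is built from terms
indexed by $h \in H$ (see [7] for the construction).

Now, the set of polynomials $\{R_{\mathbf{b}}\}$ is a family of
zero-testers in $F_{3k|H|,3k}$ [7].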

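B.3 Selector polynomials

Selector polynomials are useful for constructing low-degree
extensions of functions. We define two of them
here (a univariate and a multivariate polynomial). For
any $w \in H$,

$$S_w(z) = \prod_{y \in H,\; y \neq w} \frac{z - y}{w - y}$$

Note that $S_w(w) = 1$ and $S_w(x) = 0$ for any $x \in H$
with $x \neq w$.

For any $\mathbf{h} \in H^k$,

$$S_{\mathbf{h}}(\mathbf{x}) = \prod_{i=1}^{k} S_{h_i}(x_i)$$

Note that $S_{\mathbf{h}}(\mathbf{h}) = 1$ and $S_{\mathbf{h}}(\mathbf{x}) = 0$
for any $\mathbf{x} \in H^k$ with $\mathbf{x} \neq \mathbf{h}$.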
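The two defining properties of the selector polynomials can be checked numerically. The sketch below assumes the Lagrange-basis form of the univariate selector (which the stated properties pin down on H), a toy prime 101 in place of q, and a three-element H; the function names are hypothetical. It relies on the three-argument `pow` with exponent -1 for modular inverses, which needs Python 3.8 or later.

```python
Q = 101        # small prime standing in for q (assumption)
H = [0, 1, 2]  # H = {0, ..., |H| - 1} with |H| = 3 (toy size)


def selector(w, z, H=H, q=Q):
    """Univariate selector S_w(z) = prod_{y in H, y != w} (z - y) / (w - y) mod q."""
    numerator, denominator = 1, 1
    for y in H:
        if y != w:
            numerator = numerator * (z - y) % q
            denominator = denominator * (w - y) % q
    # pow(d, -1, q) is the modular inverse of d (Python 3.8+).
    return numerator * pow(denominator, -1, q) % q


def selector_multi(h, x, H=H, q=Q):
    """Multivariate selector S_h(x) = prod_i S_{h_i}(x_i) mod q."""
    result = 1
    for h_i, x_i in zip(h, x):
        result = result * selector(h_i, x_i, H, q) % q
    return result
```

On H the univariate selector behaves like an indicator of w, and the multivariate one like an indicator of the tuple h on $H^k$; outside H both take ordinary field values.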

C  Verified  computation  scheme 

Here we provide a complete description of our protocol
mentioned in §3. In our scheme, the outsourcing of
a computation proceeds in two steps. In the first
step, called the compilation of the computation, the
computation in a high-level language is converted into a 3-CNF
Boolean formula representation using a compiler. In the
second step, called the execution of the computation, the
computation expressed as a 3-CNF Boolean formula is
executed to generate the output.

In  the  protocol  that  we  are  going  to  describe,  we  as¬ 
sume  that  the  verifier  performs  the  compilation  step  and 
outsources  the  compiled  Boolean  formula  to  the  prover. 
We  begin  by  describing  the  computation  performed  by 
the  prover.  We  then  describe  the  algorithms  used  by  the 
verifier  to  verify  the  proof  generated  by  the  prover. 

C.1 The prover’s protocol

The protocol that we describe here is for a correctly
functioning prover. A malfunctioning prover can deviate
from this protocol arbitrarily.

At a high level, the prover obtains the verifier’s
computation, $\phi$, in the form of a 3-CNF Boolean formula.
The verifier also specifies the input to the computation,
i.e., values for some of the variables in the Boolean
formula. Let $\phi$ contain n variables. The prover first finds a
satisfying assignment to $\phi$, after assigning the
verifier-supplied values to the corresponding variables of $\phi$.

Once  the  prover  finds  the  satisfying  assignment,  it  en¬ 
codes  the  satisfying  assignment  in  a  low-degree  polyno¬ 
mial  over  a  finite  field.  The  prover  also  constructs  other 
polynomials  (partial-sum  polynomials  and  a  line  table) 
Metric  Count


Mult  <  log  (k\H\  +  l)  +  f  +  4((9*|ff|  +  2)  log  (3*|tf|  +  1)  +  3(k\H\  +  1)  log  (k\H\  +  1)  +  4)) 

Add  <  7(36A~|//|2  +  (f  +  36)k\H\  +  § k  +  20) 

Hash  <  7((y  +  24)k  log  q  +  12k1  log;/  +  12£  +  (^  +  8)) 

NA¥  <  7((  y  +  24)fa  log  q  +  ( ^  +  24)  log  q  +  12k1  s  log  q  +  | k\H\  log  q  +  36 Ir  |//|  log  q+  12 k  log  q)  +  i  +  o 

Figure 5: Upper bounds on the verifier’s cost in terms of operation counts: Mult (multiplication), Add (addition), Hash
(collision-resistant hash), and NAV (bits transferred over the network between the verifier and the prover) for verifying an outsourced
computation with a Boolean formula consisting of n literals. $\delta$ is set to $10^{-4}$ to get the error probability described in §4.1. The
functions are polylogarithmic in the size of the computation; the work done by the verifier and the network resources consumed
grow much more slowly than the size of the computation.


based on the assignment and the formula, $\phi$. The prover
evaluates these polynomials at all possible points in their
domain. Once the prover evaluates these polynomials, it
sends commitments to them to the verifier. This is
done by constructing Merkle tree(s) whose leaf nodes
contain the evaluated values and sending the root(s) of
the Merkle tree(s) to the verifier.

Later, when the verifier starts the verification protocol,
the prover has to return the requested leaf nodes from
the Merkle tree(s). The prover also provides the intermediate
hashes in the Merkle tree on the path from the requested leaf
to the root of the Merkle tree. These hashes enable the
verifier to check if the returned leaf node is part of the
Merkle tree that the prover committed to. The commitments
sent by the prover prevent the prover from changing
the proof in response to the verifier’s queries.
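The commit-then-open pattern described above can be sketched in Python as follows. This is a minimal illustration, not the protocol's actual commitment code: SHA-256 is an assumed stand-in for the collision-resistant hash, padding odd levels by duplicating the last node is an assumed convention, all function names are hypothetical, and the sketch omits hardening such as separating leaf hashes from internal-node hashes.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    # SHA-256 is an assumed stand-in for the collision-resistant hash.
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return the list of levels; levels[0] holds leaf hashes, levels[-1] the root."""
    level = [_hash(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Assumed padding rule: duplicate the last node on odd-sized levels.
            level = level + [level[-1]]
            levels[-1] = level
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def open_leaf(levels, index):
    """Prover side: collect the sibling hashes on the path from a leaf to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return path


def verify_leaf(root, leaf, index, path):
    """Verifier side: recompute the root from a returned leaf and its path."""
    node = _hash(leaf)
    for sibling in path:
        node = _hash(node + sibling) if index % 2 == 0 else _hash(sibling + node)
        index //= 2
    return node == root
```

In use, the prover would call `build_tree` on the serialized table entries and send `levels[-1][0]` as the commitment; each later query is answered with the leaf and `open_leaf(levels, index)`, which the verifier checks with `verify_leaf`.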

(1) Encoding the assignment vector as a low-degree
polynomial. Let a be an assignment vector, i.e., a sequence
of n bits. From the notation described in Figure 4
and the values specified for it, $|H^k| > n$. Therefore a
k-tuple containing elements from H can uniquely identify
a component in the assignment vector, and the
assignment vector a can be thought of as a function from
$H^k$ to {0, 1}. Let this function be $f_a$.

The function $f_a$ is then converted to a low-degree polynomial
over a finite field, F, by using the following
transformation. (When a function from $H^k$ to {0, 1}
is transformed to a low-degree polynomial over a finite
field, F, it is a k-variate polynomial with degree at most
$k|H|$.) Let $p_f \in F_{k|H|,k}$ be the low-degree polynomial. For
any $\mathbf{x} \in F^k$,

$$p_f(\mathbf{x}) = \sum_{\mathbf{h} \in H^k} S_{\mathbf{h}}(\mathbf{x})\, f_a(\mathbf{h})$$

Since $p_f \in F_{k|H|,k}$, it can be represented by using $q^k$
words each of length $\lceil \log q \rceil$ bits.
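A toy version of this encoding is sketched below in Python. The parameters (prime 101, |H| = 3, k = 2) are illustrative assumptions, as are the function names, the lexicographic mapping from indices to k-tuples, and the choice to pad unused tuples with 0; the selector is taken in Lagrange-basis form.

```python
from itertools import product

Q = 101        # small prime standing in for q (assumption)
H = [0, 1, 2]  # toy H with |H| = 3
K = 2          # toy k, so |H|^k = 9 positions are available


def selector(w, z):
    """Univariate selector S_w(z) in Lagrange-basis form, mod Q."""
    numerator, denominator = 1, 1
    for y in H:
        if y != w:
            numerator = numerator * (z - y) % Q
            denominator = denominator * (w - y) % Q
    return numerator * pow(denominator, -1, Q) % Q  # Python 3.8+


def low_degree_extension(a, x):
    """Evaluate p_f(x) = sum over h in H^k of S_h(x) * f_a(h), mod Q.

    Assumption: bit i of a is stored at the i-th tuple of H^k in
    lexicographic order; tuples beyond len(a) are treated as 0.
    """
    total = 0
    for index, h in enumerate(product(H, repeat=K)):
        if index < len(a) and a[index]:
            term = 1
            for h_i, x_i in zip(h, x):
                term = term * selector(h_i, x_i) % Q
            total = (total + term) % Q
    return total
```

On points of $H^k$ the extension returns the original bits of a, while at other points of $F^k$ it returns field elements that carry redundancy about the whole assignment.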

(2) Constructing the line table. The verifier needs
some auxiliary information, called a line table, to use the
low-degree encoding of the assignment vector. A line table
describes the “restriction” of the low-degree polynomial
to all “lines” of $F^k$ [7]. To construct the line table, the
prover performs the following step: for each
$\mathbf{b}, \mathbf{s} \in F^k$, the prover finds the $(k|H| + 1)$
coefficients of $P_{\mathbf{b},\mathbf{s}}$, where
$P_{\mathbf{b},\mathbf{s}}(x) = p_f(\mathbf{b} + \mathbf{s}x)$ for $x \in F$. Thus a
line table is a function from $F^{2k}$ to $F^{k|H|+1}$ (i.e., for all
$\mathbf{b}$ and $\mathbf{s} \in F^k$, there is an entry in the line table). This
line table can be represented by using $q^{2k}$ words each of
length $(k|H| + 1)\lceil \log q \rceil$ bits.
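One line-table entry can be computed by interpolation, as the following Python sketch shows. It is an illustration under toy assumptions (prime 101, |H| = 3, k = 2, hypothetical function names, lexicographic index mapping), and it derives the coefficients from k|H| + 1 evaluations along the line, which is one possible way for a prover to fill the table rather than a method stated in the text.

```python
from itertools import product

Q = 101               # small prime standing in for q (assumption)
H = [0, 1, 2]         # toy H
K = 2                 # toy k
DEGREE = K * len(H)   # degree bound k|H| used in the text


def selector(w, z):
    """Univariate selector S_w(z) in Lagrange-basis form, mod Q."""
    numerator, denominator = 1, 1
    for y in H:
        if y != w:
            numerator = numerator * (z - y) % Q
            denominator = denominator * (w - y) % Q
    return numerator * pow(denominator, -1, Q) % Q  # Python 3.8+


def p_f(a, x):
    """Low-degree extension of the assignment a, evaluated at x in F^k."""
    total = 0
    for index, h in enumerate(product(H, repeat=K)):
        if index < len(a) and a[index]:
            term = 1
            for h_i, x_i in zip(h, x):
                term = term * selector(h_i, x_i) % Q
            total = (total + term) % Q
    return total


def interpolate(ts, values):
    """Coefficients (lowest degree first) of the polynomial through (ts, values), mod Q."""
    coeffs = [0] * len(ts)
    for i, t_i in enumerate(ts):
        basis, denominator = [1], 1
        for j, t_j in enumerate(ts):
            if j == i:
                continue
            # Multiply the running basis polynomial by (t - t_j).
            shifted = [0] * (len(basis) + 1)
            for d, c in enumerate(basis):
                shifted[d] = (shifted[d] - c * t_j) % Q
                shifted[d + 1] = (shifted[d + 1] + c) % Q
            basis = shifted
            denominator = denominator * (t_i - t_j) % Q
        scale = values[i] * pow(denominator, -1, Q) % Q
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % Q
    return coeffs


def point_on_line(b, s, t):
    return [(b_i + s_i * t) % Q for b_i, s_i in zip(b, s)]


def line_entry(a, b, s):
    """The k|H| + 1 coefficients of P_{b,s}(t) = p_f(b + s t)."""
    ts = list(range(DEGREE + 1))
    return interpolate(ts, [p_f(a, point_on_line(b, s, t)) for t in ts])


def eval_poly(coeffs, t):
    """Horner evaluation mod Q."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * t + c) % Q
    return acc
```

A single round of the low-degree test then amounts to picking b, s, and t at random and comparing `eval_poly(line_entry(a, b, s), t)` with `p_f(a, point_on_line(b, s, t))`; for an honestly built table the two always agree.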

(3) Constructing the partial-sum polynomial tables.
The prover also needs to provide more auxiliary information
about the assignment vector and the computation. In
particular, the auxiliary information enables the verifier
to check if the satisfying assignment found by the prover
actually satisfies the Boolean formula representation of
the computation. To this end, the prover arithmetizes the
Boolean formula, as described in Appendix B.1.

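Let $\chi_j$ be the Boolean function that describes $\phi$ (the
definition of $\chi_j$ can be found in Appendix B.1). It is a function
from $H^{3k}$ to {0, 1}. Now, $\chi_j$ can be encoded as a
low-degree polynomial in $F_{3k|H|,3k}$, similar to the low-degree
extension of the assignment vector. Let $\tilde{\chi}_j \in F_{3k|H|,3k}$ be
the encoded polynomial. For $\mathbf{x} \in F^{3k}$, let $g^j(\mathbf{x})$
denote the arithmetized clause check at $\mathbf{x}$, namely
$\tilde{\chi}_j(\mathbf{x})$ multiplied by $p_j$ applied to the three
corresponding values of $p_f$, and for $\mathbf{b} \in F^{3k}$ let
$g^j_{\mathbf{b}}(\mathbf{x}) = R_{\mathbf{b}}(\mathbf{x})\, g^j(\mathbf{x})$.

The goal is to construct polynomials that help the verifier
to check if $\sum_{\mathbf{h} \in H^{3k}} g^j_{\mathbf{b}}(\mathbf{h}) = 0$ for
j = 0, 1, 2, 3 and a randomly chosen $\mathbf{b} \in F^{3k}$ (i.e., for a
randomly chosen zero-tester polynomial). Note that this condition is
equivalent to the condition for Boolean satisfiability described
in Appendix B.1.

In order to enable efficient verification of the above
condition by the verifier, the prover constructs partial-sum
polynomials for every $\mathbf{b} \in F^{3k}$ and j = 0, 1, 2, 3,
using the following definition.

For i = 1, ..., 3k, the i-th partial-sum polynomial
$g^j_{\mathbf{b},i} : F^i \to F$ is defined as follows:

$$g^j_{\mathbf{b},i}(x_1, \dots, x_i) = \sum_{y_{i+1} \in H} \sum_{y_{i+2} \in H} \cdots \sum_{y_{3k} \in H} g^j_{\mathbf{b}}(x_1, \dots, x_i, y_{i+1}, y_{i+2}, \dots, y_{3k})$$

Now, the prover creates four tables, one for each
j = 0, 1, 2, 3. Table $T_j$ contains, for all $\mathbf{b} \in F^{3k}$, for
i = 1, ..., 3k, and for all $c_1, \dots, c_{i-1} \in F$, the coefficients
of the univariate polynomials $g^j_{\mathbf{b},i}(c_1, \dots, c_{i-1}, x)$.

Algorithm 1 Low-degree test (adapted from [7])

1. Input: A k-variate polynomial, $p_f$, and line table, $T_a$.
2. Output: True if $p_f$ is $\delta$-close to a polynomial in $F_{k|H|,k}$
3. Set $\delta \leftarrow 10^{-4}$
4. repeat
5.   Select $\mathbf{b}, \mathbf{s} \in F^k$ and $t \in F$ {define
     $P_{\mathbf{b},\mathbf{s}}(t) = \sum_{i=1}^{k|H|+1} t^{i-1}\, T_a(\mathbf{b},\mathbf{s})_i$,
     where $T_a(\mathbf{b},\mathbf{s})_i$ represents the i-th element of the
     $(k|H|+1)$-dimensional vector $T_a(\mathbf{b},\mathbf{s}) \in F^{k|H|+1}$}
6.   if ($P_{\mathbf{b},\mathbf{s}}(t) \neq p_f(\mathbf{b} + \mathbf{s}t)$) then
7.     return False
8.   end if
9. until the required number of repetitions has been performed
10. return True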

C.2  The  verifier’s  protocol 

The verifier needs to know the 3-CNF Boolean formula,
$\phi$, from which it can obtain the low-degree extension of the
function $\chi_j$ for j = 0, 1, 2, 3; alternatively, it obtains
these low-degree extensions, $\tilde{\chi}_j$ for j = 0, 1, 2, 3,
from a trusted source.

At a high level, the verifier needs to perform the
following steps:

V1 Check if the purported low-degree polynomial encoding
of the assignment is actually a low-degree polynomial.

V2 Check if the purported assignment contained in the
low-degree polynomial actually satisfies the Boolean
formula representation of the computation.

V3 Check if the prover assigned the supplied input to the
input variables.

V4 Extract the output from the low-degree encoding of
the assignment.

We now describe these high-level steps in more detail.

(1) Low-degree test. First, the verifier checks if the
purported low-degree polynomial constructed by the prover
is indeed a low-degree polynomial of degree $k|H|$. This
check is performed by the verifier by reading a constant
number of words from the low-degree extension of the
assignment and the line table, obtained by interactively
querying the prover. (The verifier also checks that the
returned leaf nodes were present in the Merkle tree that the
prover committed to at the beginning of the verification
protocol.) Algorithm 1 sketches this test. Once this
algorithm returns True, the verifier is convinced that the
polynomial constructed by the prover is $\delta$-close to a
polynomial of degree $k|H|$.


Algorithm 2 Correcting a low-degree polynomial
(adapted from [7])

1. Input: $\mathbf{x} \in F^k$, a k-variate polynomial, $p_f$ ($p_f$ is
   $\delta$-close to a polynomial, $p \in F_{k|H|,k}$) and line table, $T_a$.
2. Output: $p(\mathbf{x})$
3. Choose a random $\mathbf{s} \in F^k$
4. Choose a random $t \in F$
5. if ($P_{\mathbf{x},\mathbf{s}}(t) \neq p_f(\mathbf{x} + \mathbf{s}t)$) then
6.   return False
7. else
8.   return $P_{\mathbf{x},\mathbf{s}}(0)$
9. end if

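Algorithm 3 Sum-check test (adapted from [7])

1. Input: $g^j \in F_{3k|H|,3k}$, $R_{\mathbf{b}} \in F_{3k|H|,3k}$ and a table $T_j$
   of partial-sum polynomials
2. Output: True if the product of $g^j$ and $R_{\mathbf{b}}$ sums to 0
   in $H^{3k}$.
3. Read the coefficients of $g^j_{\mathbf{b},1}(x)$
4. if ($\sum_{x \in H} g^j_{\mathbf{b},1}(x) \neq 0$) then
5.   return False
6. end if
7. Randomly choose $l_i \in F$ for i = 1, ..., 3k
8. for i = 2 to 3k do
9.   Read the coefficients of $g^j_{\mathbf{b},i}(l_1, \dots, l_{i-1}, x)$
10.  if ($\sum_{x \in H} g^j_{\mathbf{b},i}(l_1, \dots, l_{i-1}, x) \neq g^j_{\mathbf{b},i-1}(l_1, \dots, l_{i-1})$) then
11.    return False
12.  end if
13. end for
14. if ($R_{\mathbf{b}}(l_1, \dots, l_{3k})\, g^j(l_1, \dots, l_{3k}) \neq g^j_{\mathbf{b},3k}(l_1, \dots, l_{3k})$) then
15.  return False
16. else
17.  return True
18. end if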


(2) Sum-check test. Next, the verifier checks if the
assignment encoded by the low-degree polynomial actually
satisfies the Boolean formula, $\phi$. To this end, the verifier
queries the partial-sum polynomial tables constructed by
the prover. This check is performed by using the
sum-check test.

In the sum-check test, the verifier first checks if the
partial-sum polynomial tables constructed by the prover
are “consistent” with one another at a randomly chosen
location. Later, the verifier chooses a zero-tester,
uniformly at random, from the family of zero-testers (the
family that we described in Appendix B.2). Since the
verifier knows either the Boolean formula or the low-degree
extension of $\chi_j$, it can compute $g^j$ at any randomly
chosen location (the definition of $g^j$ is in Appendix C.1). This is
done by using three words from the low-degree extension
of the assignment and the value of $\tilde{\chi}_j$ at the
chosen random location.

The sum-check test checks if the product of the chosen
zero-tester and $g^j$ at the chosen random location matches
the value generated by the prover. The sum-check test is
performed once for each j = 0, 1, 2, 3, and is described
in Algorithm 3. Once the sum-check test passes, the
verifier is convinced that the assignment contained in the
low-degree polynomial satisfies $\phi$. (Since the verifier is
querying a polynomial that is only $\delta$-close to the real
polynomial, it applies the correcting procedure described
in Algorithm 2 when reading from the low-degree
polynomial.)
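The control flow of the sum-check test can be illustrated with a simplified Python sketch. It makes several assumptions that depart from the full protocol: a single polynomial g in m variables stands in for the product of the zero-tester and the arithmetized clause check (so m plays the role of 3k), the prover's tables are modeled as a callable that returns partial-sum values instead of univariate coefficients read through a Merkle commitment, and the prime 101, the set H, and all function names are illustrative.

```python
import random
from itertools import product

Q = 101        # small prime standing in for q (assumption)
H = [0, 1, 2]  # toy H


def honest_table(g, m):
    """Honest prover: table(prefix) is the partial sum of g with prefix fixed.

    For a prefix of length i this is g_i(prefix), i.e. the sum of
    g(prefix, y) over all y in H^(m - i), reduced mod Q.
    """
    def table(prefix):
        rest = m - len(prefix)
        return sum(g(*prefix, *y) for y in product(H, repeat=rest)) % Q
    return table


def sum_check(g, m, table, rng):
    """Verifier: accept only if the table is consistent with sum_{h in H^m} g(h) = 0."""
    # First partial-sum polynomial must sum to zero over H.
    if sum(table((x,)) for x in H) % Q != 0:
        return False
    point = [rng.randrange(Q) for _ in range(m)]
    # Each partial sum must be consistent with the previous one at the random prefix.
    for i in range(2, m + 1):
        prefix = tuple(point[:i - 1])
        if sum(table(prefix + (x,)) for x in H) % Q != table(prefix):
            return False
    # Final check: the last table value must match g evaluated directly.
    return g(*point) % Q == table(tuple(point))


def g_zero(x, y):
    return x - y        # sums to 0 over H x H


def g_nonzero(x, y):
    return x * y + 1    # sums to 18 over H x H
```

With an honest table, `sum_check` accepts `g_zero` for any random point and rejects `g_nonzero` at the first check, because the first partial-sum polynomial already sums to the true (nonzero) total.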

(3) Check input and extract output. Next, the verifier
needs to check if the input variables of the purported
assignment contain the verifier-supplied input. This can be
done by querying the low-degree polynomial encoding
of the assignment and checking that the input variables have
the right values. To extract the output, the verifier queries
the low-degree polynomial encoding of the purported
assignment and extracts the values assigned to the output
variables.

D  Analysis 

As noted in §4.1, we present a tight upper bound on the
count of operations performed by a verifier for verifying
a computation. These counts are expressed as functions
of n, the number of literals in the 3-CNF Boolean
formula representation of the computation. We stepped
through each step of Algorithms 1, 2, and 3, and counted the
number of multiplications, additions, and hash computations
performed by the verifier. We also counted the number of
bits transferred between the verifier and the prover for
these algorithms. These counts are functions of $\gamma$, q, $|H|$,
s, and k, which are in turn functions of n (as shown in
Figure 4). Figure 5 shows these functions. Although not
expressed directly in terms of n, they are all
polylogarithmic in n.